At the end of this Worksheet, you should be able to:
Use t.test to perform a 1 or 2 sample T Test.
Recap on the 1 and 2 sample T Tests
The structure of a hypothesis test is H A T P C: Hypotheses, Assumptions, Test statistic, P-value, Conclusion.
For a box model constructed around a certain null hypothesis, the observed value of the test statistic is (OV - EV)/SE, where the OV is calculated from the sample and the EV and SE are calculated from the box model.
For a 1 Sample Z or T Test, the null hypothesis is about the population mean, so we consider the Sample Mean, and the test statistic has a Z or T distribution respectively, depending on whether we know the population SD.
| Test | Test Statistic | P-value Curve |
|---|---|---|
| Z | \(\frac{\mbox{observed mean - population mean}}{\mbox{population SD}/\sqrt{n}}\) | Normal |
| T | \(\frac{\mbox{observed mean - population mean}}{\mbox{sample SD}/\sqrt{n}}\) | \(t_{n-1}\) |
In practice, we use t.test() in R, which calculates the test statistic and p-value.
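As a minimal sketch of the two formulas in the table, the following computes both statistics by hand and checks the T version against t.test(). The sample `x`, the null mean `mu0` and the "known" population SD `sigma` are made-up values for illustration only; they are not data from this worksheet.

```r
# Hypothetical sample and null value (illustration only, not worksheet data)
x = c(5.1, 4.8, 6.2, 5.5, 4.9, 5.7)
mu0 = 5      # hypothesised population mean (EV under Ho)
sigma = 0.6  # assumed known population SD, needed only for the Z Test
n = length(x)

# Z Test: SE uses the known population SD; p-value from the Normal curve
z = (mean(x) - mu0) / (sigma / sqrt(n))
p_z = 2 * pnorm(abs(z), lower.tail = FALSE)

# T Test: SE uses the sample SD; p-value from the t curve with n - 1 df
t_stat = (mean(x) - mu0) / (sd(x) / sqrt(n))
p_t = 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)

# t.test() should reproduce the hand-computed T statistic and p-value
res = t.test(x, mu = mu0)
c(z = z, p_z = p_z, t = t_stat, p_t = p_t)
c(t = unname(res$statistic), p = res$p.value)
```

The only difference between the two tests is which SD goes into the SE and which curve is used for the p-value.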
When you finish: Upload your answers to the Lab12 link on Ed.
Explain the structure of a T Test using an annotated diagram and an example.
A study considered caffeine’s effect on endurance, with a double blind, random order administration of caffeine capsules for 9 elite cyclists.
The following data is the time to exhaustion after 0 and 13 mg caffeine per kg body weight, with cafdiff representing the effect of caffeine on endurance.
Assume that we know the population SD is 12 mins, from a previous study.
caf0 = c(36.05, 52.47, 56.55, 45.2, 35.25, 66.38, 40.57, 57.15, 28.34) # no caffeine (base-line endurance)
caf13 = c(37.55, 59.3, 79.12, 58.33, 70.54, 69.47, 46.48, 66.35, 36.2) # 13 mg caffeine per kg body weight
cafdiff = caf13 - caf0 # This represents the 'effect' of caffeine on endurance.
cafdiff
## [1] 1.50 6.83 22.57 13.13 35.29 3.09 5.91 9.20 7.86
Test the claim that caffeine does not affect endurance, i.e. the mean cafdiff is 0.
Note: We have 1 sample, cafdiff, and we know the population SD = 12, so we will perform a 1 sample Z Test.
Draw a simple box model, with any population and sample details identified.
Ho: The mean effect of caffeine on endurance (time to exhaustion) is 0.
H1: The mean effect of caffeine on endurance is not 0.
Write down any assumptions.
n = 9
ev = 0 # from Ho in box model
se = 12/sqrt(n)
What is the formula for the test statistic?
What is the observed value of the test statistic?
ov = mean(cafdiff)
ts = (ov-ev)/se
ts
## [1] 2.927222
Using the Normal curve to model the Mean of the sample, what is the approximate p-value?
2*pnorm(ts, lower.tail = FALSE)
## [1] 0.003420044
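One way to see what this p-value means is to simulate the box model directly. The sketch below assumes a Normal box with mean 0 (from Ho) and SD 12; the Normal shape, the seed and the number of repetitions `reps` are assumptions made for illustration, not part of the worksheet.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

caf0 = c(36.05, 52.47, 56.55, 45.2, 35.25, 66.38, 40.57, 57.15, 28.34)
caf13 = c(37.55, 59.3, 79.12, 58.33, 70.54, 69.47, 46.48, 66.35, 36.2)
cafdiff = caf13 - caf0
n = length(cafdiff)
ov = mean(cafdiff)

# Box model under Ho: assumed Normal box with mean 0 and SD 12.
# Each row of the matrix is one simulated sample of size n.
reps = 100000
draws = matrix(rnorm(reps * n, mean = 0, sd = 12), nrow = reps)
sim_means = rowMeans(draws)

# Two-sided p-value: proportion of simulated means at least as far from 0 as the OV
p_sim = mean(abs(sim_means) >= abs(ov))

# Normal-curve p-value, as computed in the worksheet
p_exact = 2 * pnorm(abs(ov) / (12 / sqrt(n)), lower.tail = FALSE)
c(simulated = p_sim, normal_curve = p_exact)
```

The simulated proportion should land close to the Normal-curve p-value, with some run-to-run variation.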
What is the conclusion?
A study considered caffeine’s effect on endurance, with a double blind, random order administration of caffeine capsules for 9 elite cyclists.
The following data is the time to exhaustion after 0 and 13 mg caffeine per kg body weight, with cafdiff representing the effect of caffeine on endurance.
caf0 = c(36.05, 52.47, 56.55, 45.2, 35.25, 66.38, 40.57, 57.15, 28.34) # no caffeine (base-line endurance)
caf13 = c(37.55, 59.3, 79.12, 58.33, 70.54, 69.47, 46.48, 66.35, 36.2) # 13 mg caffeine per kg body weight
cafdiff = caf13 - caf0 # This represents the 'effect' of caffeine on endurance.
cafdiff
## [1] 1.50 6.83 22.57 13.13 35.29 3.09 5.91 9.20 7.86
Test the claim that caffeine does not affect endurance, i.e. the mean cafdiff is 0.
Note: Now, we do not know the population SD, so we will perform a 1 sample T Test.
Draw a simple box model, with any population and sample details identified.
Ho: The mean effect of caffeine on endurance (time to exhaustion) is 0.
H1: The mean effect of caffeine on endurance is not 0.
Write down any assumptions.
n = 9
ev = 0 # from Ho in box model
se = sd(cafdiff)/sqrt(n)
What is the formula for the test statistic?
What is the observed value of the test statistic?
ov = mean(cafdiff)
ts = (ov-ev)/se
ts
## [1] 3.252508
Using a \(t_{8}\) curve (df = n - 1 = 8) to model the Mean of the sample, what is the approximate p-value?
2*pt(ts, n-1, lower.tail = FALSE)
## [1] 0.01165724
What is the conclusion?
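To help with the conclusion, one option is to place the Z and T results for cafdiff side by side. This is a sketch only: the significance level `alpha = 0.05` is an assumption, since the worksheet does not fix one.

```r
# Differences as printed in the worksheet output
cafdiff = c(1.50, 6.83, 22.57, 13.13, 35.29, 3.09, 5.91, 9.20, 7.86)
n = length(cafdiff)

ts_z = (mean(cafdiff) - 0) / (12 / sqrt(n))           # known population SD = 12
ts_t = (mean(cafdiff) - 0) / (sd(cafdiff) / sqrt(n))  # sample SD instead

p_z = 2 * pnorm(abs(ts_z), lower.tail = FALSE)           # Normal curve
p_t = 2 * pt(abs(ts_t), df = n - 1, lower.tail = FALSE)  # t curve with 8 df

alpha = 0.05  # assumed significance level
data.frame(test = c("Z", "T"),
           statistic = c(ts_z, ts_t),
           p_value = c(p_z, p_t),
           reject_Ho = c(p_z, p_t) < alpha)
```

Here the T statistic is larger than the Z statistic because the sample SD is below 12, yet its p-value is larger because the t curve with 8 degrees of freedom has heavier tails than the Normal curve.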
The speedy way to do all this is:
t.test(cafdiff, mu = 0)
##
## One Sample t-test
##
## data: cafdiff
## t = 3.2525, df = 8, p-value = 0.01166
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## 3.407372 20.010405
## sample estimates:
## mean of x
## 11.70889
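The printed output can also be read programmatically from the object that t.test() returns. The sketch below pulls out the individual pieces and, as a side check, shows that a paired test on the two original columns is the same test as the 1 sample T Test on their differences. The names `res` and `res_paired` are just illustrative.

```r
caf0 = c(36.05, 52.47, 56.55, 45.2, 35.25, 66.38, 40.57, 57.15, 28.34)
caf13 = c(37.55, 59.3, 79.12, 58.33, 70.54, 69.47, 46.48, 66.35, 36.2)

# 1 sample T Test on the differences
res = t.test(caf13 - caf0, mu = 0)
res$statistic  # observed value of the test statistic
res$parameter  # degrees of freedom (n - 1)
res$p.value    # two-sided p-value
res$conf.int   # 95% confidence interval for the mean difference

# Paired T Test on the two columns: same statistic, df and p-value
res_paired = t.test(caf13, caf0, paired = TRUE)
all.equal(unname(res$statistic), unname(res_paired$statistic))
```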
Consider the following data on heart rates (beats per minute), for 2 independent groups of Sydney students, collected 20 minutes after the ‘Red Bull’ group had drunk a 250 ml cold can of Red Bull.
No_RB = 84,76,68,80,64,62,74,84,68,96,80,64,65,66
RB = 72,88,72,88,76,75,84,80,60,96,80,84
Test the claim that Red Bull (caffeine) has an effect on heart rate.
Note:
- We are comparing 2 independent populations, so we will use a 2 sample T Test.
- You can use the information given in the t.test output.
No_RB = c(84,76,68,80,64,62,74,84,68,96,80,64,65,66)
RB = c(72,88,72,88,76,75,84,80,60,96,80,84)
t.test(No_RB, RB, var.equal = TRUE)
##
## Two Sample t-test
##
## data: No_RB and RB
## t = -1.5418, df = 24, p-value = 0.1362
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -13.892538 2.011586
## sample estimates:
## mean of x mean of y
## 73.64286 79.58333
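The same (OV - EV)/SE structure sits behind this output. As a sketch, the equal-variance 2 sample T statistic can be rebuilt by hand using the pooled variance; the variable names (`sp2`, `se`, `ts`) are illustrative.

```r
No_RB = c(84,76,68,80,64,62,74,84,68,96,80,64,65,66)
RB = c(72,88,72,88,76,75,84,80,60,96,80,84)
n1 = length(No_RB)
n2 = length(RB)

# Pooled variance: weighted average of the two sample variances
sp2 = ((n1 - 1) * var(No_RB) + (n2 - 1) * var(RB)) / (n1 + n2 - 2)

# SE of the difference between the two sample means
se = sqrt(sp2 * (1 / n1 + 1 / n2))

ov = mean(No_RB) - mean(RB)  # observed difference in sample means
ev = 0                       # from Ho: no difference in population means
ts = (ov - ev) / se
df = n1 + n2 - 2
p = 2 * pt(abs(ts), df, lower.tail = FALSE)
c(t = ts, df = df, p_value = p)
```

These values should agree with the t, df and p-value reported by t.test(No_RB, RB, var.equal = TRUE).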
Draw a simple box model with the 2 populations and 2 samples, with any population and sample details identified.
Ho: The difference between the mean heart rates of the 2 populations (with and without Red Bull) is 0.
H1: The difference between the mean heart rates of the 2 populations (with and without Red Bull) is not 0.
Write down any assumptions.
From the R Output, what is the observed value of the test statistic?
From the R Output, what is the p-value?
What is the conclusion?
A1. The 2 samples are independent
Given in context.
A2. The 2 populations have equal spread (SD/variance)
var.test(No_RB,RB)
##
## F test to compare two variances
##
## data: No_RB and RB
## F = 1.1357, num df = 13, denom df = 11, p-value = 0.8428
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.334832 3.631266
## sample estimates:
## ratio of variances
## 1.135659
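The F test gives a large p-value (0.8428), so the data are consistent with equal spread. If that assumption were in doubt, a common alternative is the Welch version of the test, which is what t.test() runs by default when var.equal is not set. The sketch below compares the two; it is an aside, not part of the worksheet's required steps.

```r
No_RB = c(84,76,68,80,64,62,74,84,68,96,80,64,65,66)
RB = c(72,88,72,88,76,75,84,80,60,96,80,84)

welch = t.test(No_RB, RB)                     # var.equal = FALSE is the default
pooled = t.test(No_RB, RB, var.equal = TRUE)  # the version used in this worksheet

# Welch adjusts the degrees of freedom rather than pooling the variances
c(welch_df = unname(welch$parameter), pooled_df = unname(pooled$parameter))
c(welch_p = welch$p.value, pooled_p = pooled$p.value)
```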
A3. The 2 populations are Normal
Boxplots look fairly symmetric.
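The code that produced the boxplots is not shown here, so the following is a base R stand-in rather than the original plotting code. The axis label and title are illustrative choices.

```r
No_RB = c(84,76,68,80,64,62,74,84,68,96,80,64,65,66)
RB = c(72,88,72,88,76,75,84,80,60,96,80,84)

# Side-by-side boxplots of the two groups
bp = boxplot(list(No_RB = No_RB, RB = RB),
             horizontal = TRUE,
             xlab = "Heart rate (beats per minute)",
             main = "Heart rate by group")

# Five-number summaries used to draw each box (one column per group)
bp$stats
```

Roughly equal whisker lengths and a median near the middle of each box are what "fairly symmetric" refers to.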
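library(ggplot2)
# RB_data is not defined in the code shown; assumed here to be the two samples in long format
RB_data = data.frame(rate = c(No_RB, RB),
                     group = rep(c("No_RB", "RB"), c(length(No_RB), length(RB))))
p3 = ggplot(RB_data, aes(sample = rate, colour = group)) +
  stat_qq() + stat_qq_line() + ggtitle("QQplot")
p3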
shapiro.test(No_RB)
##
## Shapiro-Wilk normality test
##
## data: No_RB
## W = 0.90604, p-value = 0.138
shapiro.test(RB)
##
## Shapiro-Wilk normality test
##
## data: RB
## W = 0.97459, p-value = 0.9524